Similar resources
Historical Documents Modernization
Historical documents are mostly accessible to scholars specialized in the period in which the document originated. In order to increase their accessibility to a broader audience and help in the preservation of the cultural heritage, we propose a method to modernize these documents. This method is based on statistical machine translation, and aims at translating historical documents into a mode...
Unsupervised Transcription of Historical Documents
We present a generative probabilistic model, inspired by historical printing processes, for transcribing images of documents from the printing press era. By jointly modeling the text of the document and the noisy (but regular) process of rendering glyphs, our unsupervised system is able to decipher font structure and more accurately transcribe images into text. Overall, our system substantially...
Mining dates from historical documents
The essential quality of information in a digital library is accessibility. Full-text search is not enough for some collections; more can be done. Historical collections, for example, contain dates, and it would be useful to historians to be able to search by them. However, these dates occur anywhere within the text of historical documents, and to be searched they must be extracted from the doc...
Towards Ontological Modelling of Historical Documents
In this presentation we describe a methodology we have adopted for coding the semantic structure of a historical document and the resulting semantic model. To do this, we adapted currently available methodologies for ontology engineering to the context of semantic document coding. Using Protégé-2000 we then used this methodology to develop a formal ontological model and finally to encode a hist...
Effective Decomposing Approach for Historical XML Documents
Recently, XML has been widely used as the de facto standard for data representation and exchange on the Internet. In 2006, office application groups such as OpenOffice.org and Microsoft Office both adopted XML as the main data storage format. Historical XML documents often have tiny differences between versions, but are stored in individual, independent space, so the abilities for efficient storing histori...
Journal
Journal title: The Prague Bulletin of Mathematical Linguistics
Year: 2017
ISSN: 1804-0462
DOI: 10.1515/pralin-2017-0028